
    Learning with Kernels


    Classifying LEP Data with Support Vector Algorithms

    We have studied the application of different classification algorithms in the analysis of simulated high energy physics data. Whereas neural network algorithms have become a standard tool for data analysis, the performance of other classifiers, such as Support Vector Machines, has not yet been tested in this environment. We chose two different problems to compare the performance of a Support Vector Machine and a neural net trained with back-propagation: tagging events of the type e+e- -> ccbar, and identifying muons produced in multihadronic e+e- annihilation events. Comment: 7 pages, 4 figures, submitted to proceedings of AIHENP99, Crete, April 1999.

    Synthesis and Characterization of Copolymers of Lanthanide Complexes with Styrene

    Copolymers of 2-methyl-5-phenylpentene-1-dione-3,5 with styrene in a 5:95 ratio, containing Eu, Yb, and Eu, Yb with 1,10-phenanthroline, were synthesized for the first time. The luminescence spectra of the obtained metal complexes and copolymers in solutions, films, and the solid state are investigated and analyzed. The solubilization of the β-diketonate complexes with phenanthroline was shown to change the luminescence intensity in such complexes. The obtained copolymers can be used as potential materials for organic light-emitting devices.

    A Kernel Method for the Two-sample Problem

    We propose a framework for analyzing and comparing distributions, allowing us to design statistical tests to determine if two samples are drawn from different distributions. Our test statistic is the largest difference in expectations over functions in the unit ball of a reproducing kernel Hilbert space (RKHS). We present two tests based on large deviation bounds for the test statistic, while a third is based on the asymptotic distribution of this statistic. The test statistic can be computed in quadratic time, although efficient linear-time approximations are available. Several classical metrics on distributions are recovered when the function space used to compute the difference in expectations is allowed to be more general (e.g., a Banach space). We apply our two-sample tests to a variety of problems, including attribute matching for databases using the Hungarian marriage method, where they perform strongly. Excellent performance is also obtained when comparing distributions over graphs, for which these are the first such tests.
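    The quadratic-time statistic described in this abstract (the maximum mean discrepancy, MMD) is short enough to sketch directly. The following is a minimal pure-Python illustration on scalar samples with a Gaussian RBF kernel, not the authors' implementation; the sample sizes and bandwidth are arbitrary choices for the example.

```python
import math
import random

def rbf(x, y, sigma=1.0):
    # Gaussian RBF kernel k(x, y) = exp(-(x - y)^2 / (2 * sigma^2)).
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def mmd2_biased(xs, ys, kernel=rbf):
    # Biased quadratic-time estimate of the squared MMD:
    # mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    m, n = len(xs), len(ys)
    kxx = sum(kernel(a, b) for a in xs for b in xs) / (m * m)
    kyy = sum(kernel(a, b) for a in ys for b in ys) / (n * n)
    kxy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return kxx + kyy - 2 * kxy

random.seed(0)
same = [random.gauss(0, 1) for _ in range(200)]
other = [random.gauss(0, 1) for _ in range(200)]   # same distribution
shifted = [random.gauss(1, 1) for _ in range(200)]  # mean-shifted distribution

baseline = mmd2_biased(same, other)      # small: same underlying distribution
separated = mmd2_biased(same, shifted)   # larger: distributions differ
```

A full test would compare `separated` against a threshold derived from the large-deviation or asymptotic bounds mentioned above; the sketch only computes the statistic itself.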

    Local Ranking Problem on the BrowseGraph

    The "Local Ranking Problem" (LRP) concerns the computation of a centrality-like rank on a local graph, where the scores of the nodes can differ significantly from the ones computed on the global graph. Previous work has studied the LRP on the hyperlink graph, but never on the BrowseGraph, namely a graph where nodes are webpages and edges are browsing transitions. Recently, this graph has received more and more attention in tasks such as ranking, prediction, and recommendation. However, a web server only observes the browsing traffic performed on its own pages (the local BrowseGraph) and, as a consequence, the local computation can lead to estimation errors, which can hinder the growing number of applications that build on this graph. Also, although the divergence between the local and global ranks has been measured, the possibility of estimating this divergence using only local knowledge has been largely overlooked. These aspects are of great interest for online service providers who want to: (i) gauge their ability to correctly assess the importance of their resources based only on their local knowledge, and (ii) take into account real user browsing fluxes, which capture actual user interest better than the static hyperlink network. We study the LRP on a BrowseGraph from a large news provider, considering as subgraphs the aggregations of browsing traces of users coming from different domains. We show that the distance between rankings can be accurately predicted based only on structural information of the local graph, achieving an average rank correlation as high as 0.8.
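    The local-vs-global ranking gap is easy to reproduce on a toy graph. The sketch below is an illustrative assumption, not the paper's method or data: it runs plain PageRank power iteration on a five-page graph and on the three-page subgraph a single server would see, where edges leaving the local view are simply dropped.

```python
def pagerank(edges, nodes, damping=0.85, iters=100):
    # Power iteration on a directed graph given as {node: [out-neighbours]},
    # restricted to the node set `nodes` (edges leaving it are ignored).
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: (1 - damping) / n for v in nodes}
        for u in nodes:
            outs = [v for v in edges.get(u, []) if v in nodes]
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    nxt[v] += share
            else:
                # Dangling node: spread its mass uniformly.
                for v in nodes:
                    nxt[v] += damping * rank[u] / n
        rank = nxt
    return rank

# Hypothetical global browse graph over pages A..E, vs the local view of a
# server that only hosts (and only observes traffic among) pages A, B, C.
edges = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": ["E"], "E": ["A"]}
global_rank = pagerank(edges, {"A", "B", "C", "D", "E"})
local_rank = pagerank(edges, {"A", "B", "C"})
```

In the local view, C's outgoing edge to D disappears, so A, B, C form a plain cycle and score identically, while the global computation ranks them differently; this is exactly the kind of divergence the paper sets out to predict.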

    On landmark selection and sampling in high-dimensional data analysis

    In recent years, the spectral analysis of appropriately defined kernel matrices has emerged as a principled way to extract the low-dimensional structure often prevalent in high-dimensional data. Here we provide an introduction to spectral methods for linear and nonlinear dimension reduction, emphasizing ways to overcome the computational limitations currently faced by practitioners with massive datasets. In particular, a data subsampling or landmark selection process is often employed to construct a kernel based on partial information, followed by an approximate spectral analysis termed the Nyström extension. We provide a quantitative framework to analyse this procedure, and use it to demonstrate algorithmic performance bounds on a range of practical approaches designed to optimize the landmark selection process. We compare the practical implications of these bounds by way of real-world examples drawn from the field of computer vision, whereby low-dimensional manifold structure is shown to emerge from high-dimensional video data streams. Comment: 18 pages, 6 figures, submitted for publication.
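    The core of the Nyström idea is approximating a full kernel matrix K by C W⁻¹ Cᵀ, where C holds kernel values between all points and a few landmarks and W is the small landmark-landmark kernel matrix. The toy sketch below (an illustration, not the paper's framework) inverts W by hand and so allows at most two landmarks; the points, landmarks, and bandwidth are arbitrary assumptions.

```python
import math

def rbf(x, y, sigma=1.0):
    # Gaussian kernel between two scalars.
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def nystrom_approx(points, landmarks, kernel=rbf):
    # Nystrom approximation K_hat = C W^{-1} C^T, where
    # C[i][p] = k(points[i], landmarks[p]), W[p][q] = k(l_p, l_q).
    # For a self-contained sketch we invert W by hand (<= 2 landmarks).
    C = [[kernel(x, l) for l in landmarks] for x in points]
    if len(landmarks) == 1:
        winv = [[1.0 / kernel(landmarks[0], landmarks[0])]]
    elif len(landmarks) == 2:
        a = kernel(landmarks[0], landmarks[0])
        b = kernel(landmarks[0], landmarks[1])
        d = kernel(landmarks[1], landmarks[1])
        det = a * d - b * b
        winv = [[d / det, -b / det], [-b / det, a / det]]
    else:
        raise ValueError("sketch supports at most 2 landmarks")
    n, m = len(points), len(landmarks)
    return [[sum(C[i][p] * winv[p][q] * C[j][q]
                 for p in range(m) for q in range(m))
             for j in range(n)] for i in range(n)]

points = [0.0, 0.5, 1.0, 1.5, 2.0]
approx = nystrom_approx(points, landmarks=[0.5, 1.5])
exact = [[rbf(x, y) for y in points] for x in points]
```

A useful sanity check is that the approximation is exact on rows belonging to landmark points (here indices 1 and 3); landmark selection, the subject of the paper, is precisely about choosing landmarks so that the error on the remaining rows is small.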

    A framework for space-efficient string kernels

    String kernels are typically used to compare genome-scale sequences whose length makes alignment impractical, yet their computation is based on data structures that are either space-inefficient or incur large slowdowns. We show that a number of exact string kernels, like the k-mer kernel, the substring kernels, a number of length-weighted kernels, the minimal absent words kernel, and kernels with Markovian corrections, can all be computed in O(nd) time and in o(n) bits of space in addition to the input, using just a rangeDistinct data structure on the Burrows-Wheeler transform of the input strings, which takes O(d) time per element in its output. The same bounds hold for a number of measures of compositional complexity based on multiple values of k, like the k-mer profile and the k-th order empirical entropy, and for calibrating the value of k using the data.
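    As a baseline for what these kernels compute, the k-mer (spectrum) kernel has a simple direct implementation: count all length-k substrings of each string and take the inner product of the two count vectors. The sketch below is only illustrative and has none of the space efficiency of the BWT-based method described above.

```python
from collections import Counter

def kmer_profile(s, k):
    # Multiset of all length-k substrings of s (its k-mer profile).
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))

def kmer_kernel(s, t, k):
    # Spectrum kernel: inner product of the two k-mer count vectors.
    ps, pt = kmer_profile(s, k), kmer_profile(t, k)
    return sum(count * pt[mer] for mer, count in ps.items())
```

For example, `kmer_kernel("ACGT", "ACGT", 2)` is 3 (the shared 2-mers AC, CG, GT each contribute 1), while strings sharing no k-mer score 0. This direct approach stores the profiles explicitly, which is exactly the space cost the paper's o(n)-bit construction avoids.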

    Deep Learning for Forecasting Stock Returns in the Cross-Section

    Many studies have used machine learning techniques, including neural networks, to predict stock returns. Recently, a method known as deep learning, which achieves high performance mainly in image recognition and speech recognition, has attracted attention in the machine learning field. This paper implements deep learning to predict one-month-ahead stock returns in the cross-section of the Japanese stock market and investigates the performance of the method. Our results show that deep neural networks generally outperform shallow neural networks, and that the best networks also outperform representative machine learning models. These results indicate that deep learning is a promising approach for predicting stock returns in the cross-section. Comment: 12 pages, 2 figures, 8 tables, accepted at PAKDD 201

    Robust artificial neural networks and outlier detection. Technical report

    Large outliers break down linear and nonlinear regression models. Robust regression methods allow one to filter out the outliers when building a model. By replacing the traditional least squares criterion with the least trimmed squares criterion, in which up to half of the data are treated as potential outliers, one can fit accurate regression models to strongly contaminated data. High-breakdown methods have become very well established in linear regression, but have only recently begun to be applied to nonlinear regression. In this work, we examine the problem of fitting artificial neural networks to contaminated data using the least trimmed squares criterion. We introduce a penalized least trimmed squares criterion which prevents unnecessary removal of valid data. Training of ANNs leads to a challenging non-smooth global optimization problem. We compare the efficiency of several derivative-free optimization methods in solving it, and show that our approach identifies the outliers correctly when ANNs are used for nonlinear regression.
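    The least trimmed squares idea can be illustrated on a model much simpler than a neural network. A standard way to approximate the LTS fit is the concentration step: repeatedly refit on the h points with the smallest residuals. The sketch below applies this to a straight line; the data, helper names, and the choice h = 15 are illustrative assumptions, not the paper's penalized criterion or derivative-free solvers.

```python
def ols(pts):
    # Closed-form least squares line y = a*x + b through pts.
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

def lts_line(pts, h, steps=20):
    # Approximate least trimmed squares via concentration steps:
    # repeatedly refit OLS on the h points with the smallest residuals.
    a, b = ols(pts)
    for _ in range(steps):
        by_resid = sorted(pts, key=lambda p: (p[1] - a * p[0] - b) ** 2)
        a, b = ols(by_resid[:h])
    return a, b

# 20 points exactly on y = 2x + 1, contaminated with 3 gross outliers.
clean = [(x, 2 * x + 1) for x in range(20)]
outliers = [(3, 60.0), (8, -50.0), (12, 90.0)]
data = clean + outliers
slope, intercept = lts_line(data, h=15)   # trimmed fit ignores the outliers
ols_slope, ols_intercept = ols(data)      # plain OLS is pulled off the line
```

Because the trimmed fit ends up refitting only clean, exactly collinear points, it recovers the true line, while plain least squares on the contaminated data does not; the paper's contribution is making this robustness work for ANN training, where the resulting objective is non-smooth.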

    Neuropathology in COVID-19 autopsies is defined by microglial activation and lesions of the white matter with emphasis in cerebellar and brain stem areas

    Introduction: This study aimed to investigate microglial and macrophage activation in 17 patients who died in the context of a COVID-19 infection in 2020 and 2021. Methods: Through immunohistochemical analysis, the lysosomal marker CD68 was used to detect diffuse parenchymal microglial activity, pronounced perivascular macrophage activation, and macrophage clusters. COVID-19 patients were compared to control patients and grouped with regard to clinical aspects. Detection of viral proteins was attempted in different regions using multiple commercially available antibodies. Results: Microglial and macrophage activation was most pronounced in the white matter, with emphasis in brain stem and cerebellar areas. Analysis of lesion patterns yielded no correlation between disease severity and neuropathological changes. The occurrence of macrophage clusters could not be associated with a severe course of disease or with preconditions, but represents a more advanced stage of microglial and macrophage activation. Severe neuropathological changes in COVID-19 were comparable to those in severe influenza. Hypoxic damage was not a confounder for the described neuropathology. The macrophage/microglia reaction was less pronounced in post-COVID-19 patients, but still detectable, e.g., in the brain stem. Commercially available antibodies for the detection of SARS-CoV-2 virus material in immunohistochemistry yielded no specific signal over controls. Conclusion: The presented microglial and macrophage activation might be an explanation for the long COVID syndrome.